Uncertainty sampling methods for one-class classifiers

Author

  • Piotr Juszczak
Abstract

Selective sampling, a form of active learning, reduces the cost of labeling supplementary training data by requesting labels only for the most informative unlabeled examples. Adding this information to an initial, randomly chosen training set is expected to improve the generalization performance of a learning machine. We investigate several methods for selecting the most informative examples in the context of one-class classification (OCC) problems, i.e., problems where only (or nearly only) examples of the so-called target class are available. We applied selective sampling algorithms to a variety of domains, including real-world problems: mine detection and texture segmentation. The goal of this paper is to show why the best or most often used selective sampling methods for two- or multi-class problems are not necessarily the best ones for the one-class classification problem. By modifying the sampling methods, we present a way of selecting a small subset of the unlabeled data to be presented to an expert for labeling such that the performance of the retrained one-class classifier is significantly improved.
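As a rough illustration of the kind of selection step the abstract refers to, the sketch below applies the usual boundary-proximity form of uncertainty sampling to a toy one-class model. It is not the paper's method: the centroid-plus-radius model, the function names `fit_centroid_occ` and `select_most_uncertain`, the `reject_frac` parameter, and the synthetic data are all assumptions made for this example. It shows the two-class style criterion that the abstract says is not necessarily the best choice for one-class problems.

```python
import math
import random


def fit_centroid_occ(targets, reject_frac=0.1):
    """Fit a toy one-class model: a centroid plus a distance threshold.

    Assumption for illustration only: the threshold (radius) is set so that
    roughly `reject_frac` of the training targets fall outside it.
    """
    dim = len(targets[0])
    centroid = [sum(x[d] for x in targets) / len(targets) for d in range(dim)]
    dists = sorted(math.dist(x, centroid) for x in targets)
    idx = min(len(dists) - 1, int((1.0 - reject_frac) * len(dists)))
    return centroid, dists[idx]


def select_most_uncertain(model, unlabeled, k):
    """Return the k unlabeled points whose distance to the centroid is
    closest to the radius, i.e. the points nearest the current boundary."""
    centroid, radius = model
    ranked = sorted(unlabeled, key=lambda x: abs(math.dist(x, centroid) - radius))
    return ranked[:k]


# Synthetic data (hypothetical): target examples from a 2-D Gaussian,
# unlabeled candidates spread uniformly over a wider square.
rng = random.Random(0)
targets = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
unlabeled = [(rng.uniform(-4, 4), rng.uniform(-4, 4)) for _ in range(500)]

model = fit_centroid_occ(targets)
queries = select_most_uncertain(model, unlabeled, k=5)  # send these to the expert
```

In an active learning loop, the expert would label `queries`, the labeled points would be added to the training set, and the model would be refit before the next round of selection.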

Similar resources

Classification of Cardiac Arrhythmias Based on Combining Neural Network Results with Dempster-Shafer Evidence Theory

Cardiac arrhythmias are among the most common heart diseases and may cause the death of the patient, so detecting them is extremely important. Three categories of arrhythmia, namely PAC, PVC, and normal, are considered in this paper, based on classifier fusion using evidence theory. In this study, the ECG signal is first sampled with 250 points. Moreove...

Heterogeneous Uncertainty Sampling for Supervised Learning

Uncertainty sampling methods iteratively request class labels for training instances whose classes are uncertain despite the previous labeled instances. These methods can greatly reduce the number of instances that an expert need label. One problem with this approach is that the classifier best suited for an application may be too expensive to train or use during the selection of instances. We ...

A New Hybrid Framework for Filter based Feature Selection using Information Gain and Symmetric Uncertainty (TECHNICAL NOTE)

Feature selection is a pre-processing technique used for eliminating the irrelevant and redundant features which results in enhancing the performance of the classifiers. When a dataset contains more irrelevant and redundant features, it fails to increase the accuracy and also reduces the performance of the classifiers. To avoid them, this paper presents a new hybrid feature selection method usi...

Comparison of Data Sampling Approaches for Imbalanced Bioinformatics Data

Class imbalance is a frequent problem found in bioinformatics datasets. Unfortunately, the minority class is usually also the class of interest. One of the methods to improve this situation is data sampling. There are a number of different data sampling methods, each with their own strengths and weaknesses, which makes choosing one a difficult prospect. In our work we compare three data samplin...

Publication date: 2003